Is intelligibility still the main problem? A review of perceptual quality dimensions of synthetic speech

Authors

  • Florian Hinterleitner
  • Christoph Norrenbrock
  • Sebastian Möller
Abstract

In this paper, we present a comparative overview of 9 studies on perceptual quality dimensions of synthetic speech. Different subjective assessment techniques have been used to evaluate the text-to-speech (TTS) stimuli in each of these tests: in a semantic differential, the test participants rate every stimulus on a given set of rating scales, while in a paired comparison test, the subjects rate the similarity of pairs of stimuli. Perceptual quality dimensions can be derived from the results of both test methods, either by performing a factor analysis or via multidimensional scaling. We show that even though the 9 tests differ in terms of used synthesizer types, stimulus duration, language, and quality assessment methods, the resulting perceptual quality dimensions can be linked to 5 universal quality dimensions of synthetic speech: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) disturbances, and (v) calmness.
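To make the paired-comparison route more concrete, the following is a minimal sketch of classical (Torgerson) multidimensional scaling in plain Python. It is an illustration only: the abstract does not say which MDS variant the reviewed studies used, and the function names (`double_center`, `top_eigenpair`, `classical_mds`) as well as the toy dissimilarity matrix are assumptions introduced here, not material from the paper. The sketch also assumes a symmetric, roughly Euclidean dissimilarity matrix.

```python
import math
import random


def double_center(dissim):
    """Return B = -1/2 * J * D^2 * J for a symmetric dissimilarity matrix D."""
    n = len(dissim)
    sq = [[dissim[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(row) / n for row in sq]
    grand_mean = sum(row_mean) / n
    # D is assumed symmetric, so column means equal row means.
    return [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
             for j in range(n)] for i in range(n)]


def mat_vec(mat, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in mat]


def top_eigenpair(mat, iterations=500, seed=0):
    """Power iteration: dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    rng = random.Random(seed)
    vec = [rng.uniform(-1.0, 1.0) for _ in mat]  # random start, fixed seed
    for _ in range(iterations):
        nxt = mat_vec(mat, vec)
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm < 1e-12:  # nothing left to extract
            return 0.0, [0.0] * len(mat)
        vec = [x / norm for x in nxt]
    value = sum(v * w for v, w in zip(vec, mat_vec(mat, vec)))  # Rayleigh quotient
    return value, vec


def classical_mds(dissim, n_dims=2):
    """Embed stimuli in n_dims perceptual dimensions from pairwise dissimilarities."""
    b = double_center(dissim)
    n = len(b)
    coords = [[] for _ in range(n)]
    for _ in range(n_dims):
        value, vec = top_eigenpair(b)
        scale = math.sqrt(max(value, 0.0))  # negative eigenvalues carry no distance
        for i in range(n):
            coords[i].append(scale * vec[i])
        # Deflate: remove the extracted component before finding the next one.
        b = [[b[i][j] - value * vec[i] * vec[j] for j in range(n)]
             for i in range(n)]
    return coords


# Hypothetical dissimilarity ratings for four stimuli (toy data, not from the paper).
toy_dissim = [
    [0.0, 3.0, 5.0, 4.0],
    [3.0, 0.0, 4.0, 5.0],
    [5.0, 4.0, 0.0, 3.0],
    [4.0, 5.0, 3.0, 0.0],
]
coords = classical_mds(toy_dissim, n_dims=2)
```

Each row of `coords` places one stimulus in a two-dimensional space whose axes play the role of candidate perceptual dimensions. The axes are only determined up to rotation and sign, so interpreting them (for example as "naturalness" or "disturbances") still requires additional ratings or expert judgment.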


Related resources

Speech difficulties in Joubert syndrome

Introduction: "Joubert syndrome" was first introduced in 1969. This syndrome is a rare genetic disease with an autosomal dominant pattern. Hypotonia, ataxia, and motor delay are known clinical manifestations of the disease. In the few reports of this syndrome, mostly functional and structural components studied and radiographic images such as speech and language developmental delay symptoms has been l...


Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques

One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis, and each has its own challenges. Unit-selection speech synthesis and statistical parametr...


Assessing the Intelligibility and Quality of HMM-based Speech Synthesis with a Variable Degree of Articulation

This paper focuses on the assessment of both the intelligibility and the quality of speech when using a variable degree of articulation (hypo/hyperarticulation) in the framework of HMM-based speech synthesis. Intelligibility is evaluated when the synthesizer is working in adverse conditions. The adaptation of a neutral speech synthesizer to generate hypo and hyperarticulated speech is first per...


Effects of intelligibility on working memory demand for speech perception.

Understanding low-intelligibility speech is effortful. In three experiments, we examined the effects of intelligibility on working memory (WM) demands imposed by perception of synthetic speech. In all three experiments, a primary speeded word recognition task was paired with a secondary WM-load task designed to vary the availability of WM capacity during speech perception. Speech intelligibilit...


Diagnostic Evaluation of Synthetic Speech Using Speech Recognition

The paper presents experiments on the use of automatic speech recognition for diagnostic evaluation of synthetic speech. Our previous work on the topic showed a strong correlation between the subjective and objective evaluation (ITU-T Rec. P.862 PESQ) of the quality of synthetic speech. The main drawback of the approach was the need for original human (reference) recordings in one to one mappin...




Publication date: 2013